AZ Twitter Sentiment Analysis

Overview

This project classifies AstraZeneca-related tweets as Positive, Neutral, or Negative using machine learning. The pipeline covers data preprocessing, TF-IDF vectorization, Logistic Regression modeling, and visualizations of the results. The goal is to give insight into public perception of AstraZeneca and to inform communication strategies.
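The preprocessing, TF-IDF, and Logistic Regression steps could be wired together roughly as in the sketch below. It assumes scikit-learn is installed. The `clean_tweet` helper, the cleaning rules, the hyperparameters, and the toy tweets are all hypothetical illustrations, not the project's actual code or data.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (also removes '#')
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


# Hypothetical toy data, only to show the expected input shape.
tweets = [
    "Got my @AstraZeneca jab today, feeling great! #vaccine",
    "So grateful for the AstraZeneca team, amazing work",
    "AstraZeneca publishes quarterly results https://example.com/report",
    "AstraZeneca announces a new trial site",
    "Worried about side effects after the AstraZeneca shot",
    "Terrible experience, very disappointed with AstraZeneca",
]
labels = ["Positive", "Positive", "Neutral", "Neutral", "Negative", "Negative"]

# TF-IDF features feed a multiclass Logistic Regression classifier.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_tweet, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(tweets, labels)

predictions = model.predict(["Feeling great after my AstraZeneca jab"])
print(predictions)
```

Passing the cleaning function as the vectorizer's `preprocessor` keeps cleaning and vectorization in one pipeline object, so the same steps are applied at training and prediction time.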

Tool Used

Python

Results

The project categorized AstraZeneca-related tweets into Positive, Neutral, and Negative sentiment classes. Most tweets were Positive (47.6%) or Neutral (43.6%), indicating an overall favorable or balanced perception of AstraZeneca. The remaining 8.8% were Negative, pointing to areas where public concern or dissatisfaction may exist and need attention.
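A class distribution like the one reported above can be computed from a list of labels with a few lines of standard-library Python. The `sentiment_distribution` helper and the 1,000-label sample are hypothetical. The sample is sized only to reproduce the reported shares and is not the project's dataset.

```python
from collections import Counter


def sentiment_distribution(labels):
    """Return each class's share of the labels as a percentage (one decimal)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical sample of 1,000 labels mirroring the reported shares.
labels = ["Positive"] * 476 + ["Neutral"] * 436 + ["Negative"] * 88

distribution = sentiment_distribution(labels)
print(distribution)
```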

The Logistic Regression classifier achieved an overall accuracy of 70.4%. Precision, recall, and F1-scores for the Positive and Neutral classes were satisfactory, showing that the model classifies these sentiments effectively. However, the model never predicted the Negative class, so no Negative tweets were identified. This shortfall was primarily due to class imbalance in the dataset: Negative tweets were significantly underrepresented.
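The sketch below illustrates why overall accuracy can hide this kind of failure, and one common way to counter class imbalance. The toy labels and both helper functions are hypothetical. The weighting formula, n_samples / (n_classes * class_count), is the "balanced" heuristic that scikit-learn applies when `class_weight="balanced"` is passed to `LogisticRegression`. It is shown here as a possible mitigation, not as something the project used.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    true_counts = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / n for label, n in true_counts.items()}


def balanced_class_weights(y_true):
    """Weights inversely proportional to class frequency."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Hypothetical toy labels: the rare Negative class is never predicted.
y_true = ["Positive"] * 5 + ["Neutral"] * 4 + ["Negative"] * 1
y_pred = ["Positive"] * 5 + ["Neutral"] * 4 + ["Neutral"] * 1

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
weights = balanced_class_weights(y_true)

print(accuracy)  # high overall accuracy
print(recall)    # but recall for Negative is 0.0
print(weights)   # the rare class receives the largest weight
```

In this toy case accuracy looks healthy while every Negative example is missed, which is why per-class metrics matter on imbalanced data. Weighting the rare class more heavily makes its errors costlier during training. Resampling the training set is another option.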

The analysis also highlighted the importance of regularly updating the dataset with newer tweets so that the model captures evolving trends in public sentiment. Domain-specific language and context should also be taken into account to improve text understanding.